130 research outputs found
Discussion of ``2004 IMS Medallion Lecture: Local Rademacher complexities and oracle inequalities in risk minimization'' by V. Koltchinskii [arXiv:0708.0083]
Estimation of matrices with row sparsity
An increasing number of applications is concerned with recovering a sparse
matrix from noisy observations. In this paper, we consider the setting where
each row of the unknown matrix is sparse. We establish minimax optimal rates of
convergence for estimating matrices with row sparsity. A major focus in the
present paper is on the derivation of lower bounds.
Learning by mirror averaging
Given a finite collection of estimators or classifiers, we study the problem
of model selection type aggregation, that is, we construct a new estimator or
classifier, called aggregate, which is nearly as good as the best among them
with respect to a given risk criterion. We define our aggregate by a simple
recursive procedure which solves an auxiliary stochastic linear programming
problem related to the original nonlinear one and constitutes a special case of
the mirror averaging algorithm. We show that the aggregate satisfies sharp
oracle inequalities under some general assumptions. The results are applied to
several problems including regression, classification and density estimation.
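For orientation, a minimal numerical sketch of an exponential-weights aggregate averaged over iterations, in the spirit of the mirror averaging recursion described above; the temperature parameter beta, the loss matrix, and all names here are illustrative assumptions, not the paper's procedure:

```python
import numpy as np

def mirror_averaging(losses, beta=1.0):
    """Aggregate M candidate estimators by averaged exponential weights.

    losses: (n, M) array; losses[t, j] is the loss incurred by
            candidate j on observation t.
    Returns a convex weight vector: the time average of the
    exponential-weights distributions built from cumulative losses.
    """
    n, M = losses.shape
    cum = np.zeros(M)                          # cumulative losses per candidate
    avg = np.zeros(M)                          # running sum of weight vectors
    for t in range(n):
        w = np.exp(-(cum - cum.min()) / beta)  # shift for numerical stability
        w /= w.sum()                           # exponential-weights distribution
        avg += w
        cum += losses[t]
    return avg / n                             # Cesàro average over iterations

# Usage: the weights concentrate on the candidate with the smallest loss.
rng = np.random.default_rng(0)
losses = rng.random((200, 3)) * np.array([1.0, 0.5, 1.5])
print(mirror_averaging(losses, beta=5.0))
```

The averaging over iterations, rather than using the final exponential-weights vector alone, is one way to read the recursive construction sketched above.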
Penalized maximum likelihood and semiparametric second-order efficiency
We consider the problem of estimation of a shift parameter of an unknown
symmetric function in Gaussian white noise. We introduce a notion of
semiparametric second-order efficiency and propose estimators that are
semiparametrically efficient and second-order efficient in our model. These
estimators are of a penalized maximum likelihood type with an appropriately
chosen penalty. We argue that second-order efficiency is crucial in
semiparametric problems since only the second-order terms in asymptotic
expansion for the risk account for the behavior of the ``nonparametric
component'' of a semiparametric procedure, and they are not dramatically
smaller than the first-order terms.
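For context, the shift model in question can be sketched as follows; the notation and the form of the penalty are assumptions here, not quoted from the paper:

```latex
% Gaussian white noise model with unknown symmetric f and shift theta:
dX(t) = f(t - \theta)\,dt + \varepsilon\,dW(t), \qquad t \in [0,1],
% and a penalized maximum likelihood estimator of the shift:
\hat{\theta} = \arg\max_{\theta} \sup_{f}
  \bigl( \ell_\varepsilon(\theta, f) - \mathrm{pen}(f) \bigr),
```

where $\ell_\varepsilon$ is the log-likelihood and $\mathrm{pen}$ a smoothness penalty whose choice governs the second-order term in the risk expansion.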
Variable selection with Hamming loss
We derive non-asymptotic bounds for the minimax risk of variable selection
under expected Hamming loss in the Gaussian mean model in $\mathbb{R}^d$ for classes of $s$-sparse vectors separated from 0 by a constant $a>0$. In some cases, we get exact expressions for the non-asymptotic minimax risk as a function of $(d,s,a)$ and find the minimax selectors explicitly. These results
are extended to dependent or non-Gaussian observations and to the problem of
crowdsourcing. Analogous conclusions are obtained for the probability of wrong
recovery of the sparsity pattern. As corollaries, we derive necessary and
sufficient conditions for such asymptotic properties as almost full recovery
and exact recovery. Moreover, we propose data-driven selectors that provide
almost full and exact recovery adaptively to the parameters of the classes.
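As a hedged illustration of the kind of selector studied here, a coordinatewise thresholding rule for the sparsity pattern; the specific threshold formula and all names below are assumptions for the example, not the paper's exact minimax selector:

```python
import numpy as np

def threshold_selector(y, s, a, sigma=1.0):
    """Select components of an s-sparse mean vector from y = theta + noise.

    Declares component j active iff |y_j| exceeds a threshold that
    balances the separation a against the sparsity level s out of d.
    The formula below is an illustrative assumption.
    """
    d = len(y)
    t = a / 2.0 + (sigma**2 / a) * np.log((d - s) / s)
    return (np.abs(y) > t).astype(int)   # estimated sparsity pattern

# Usage: Hamming loss of the selector on a synthetic s-sparse vector.
rng = np.random.default_rng(1)
d, s, a = 1000, 10, 3.0
theta = np.zeros(d); theta[:s] = a
eta_hat = threshold_selector(theta + rng.standard_normal(d), s, a)
print("Hamming loss:", int(np.sum(eta_hat != (theta != 0))))
```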
Aggregation by exponential weighting, sharp PAC-Bayesian bounds and sparsity
We study the problem of aggregation under the squared loss in the model of
regression with deterministic design. We obtain sharp PAC-Bayesian risk bounds
for aggregates defined via exponential weights, under general assumptions on
the distribution of errors and on the functions to aggregate. We then apply
these results to derive sparsity oracle inequalities.
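For concreteness, a standard form of the exponentially weighted aggregate in this setting; the prior $\pi$, temperature $\beta$, and empirical-norm notation are sketched from the usual conventions rather than quoted from the paper:

```latex
\hat{f}_{\mathrm{EW}} = \sum_{j=1}^{M} \theta_j f_j,
\qquad
\theta_j = \frac{\pi_j \exp\bigl(-\|y - f_j\|_n^2 / \beta\bigr)}
                {\sum_{k=1}^{M} \pi_k \exp\bigl(-\|y - f_k\|_n^2 / \beta\bigr)},
```

where $\|\cdot\|_n$ is the empirical norm over the $n$ design points. Sparsity oracle inequalities can then be obtained by choosing a prior $\pi$ that downweights candidates with many nonzero coefficients.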
Estimating Mutual Information
We present two classes of improved estimators for mutual information $I(X,Y)$, from samples of random points distributed according to some joint probability density $\mu(x,y)$. In contrast to conventional estimators based on binnings, they are based on entropy estimates from $k$-nearest neighbour distances. This means that they are data efficient (with $k=1$ we resolve structures down to the smallest possible scales), adaptive (the resolution is higher where data are more numerous), and have minimal bias. Indeed, the bias of the underlying entropy estimates is mainly due to non-uniformity of the density at the smallest resolved scale, giving typically systematic errors which scale as functions of $k/N$ for $N$ points. Numerically, we find that both families become exact for independent distributions, i.e. the estimator vanishes (up to statistical fluctuations) if $\mu(x,y)=\mu(x)\,\mu(y)$. This holds for all tested marginal distributions and for all dimensions of $x$ and $y$. In addition, we give estimators for redundancies
between more than 2 random variables. We compare our algorithms in detail with
existing algorithms. Finally, we demonstrate the usefulness of our estimators
for assessing the actual independence of components obtained from independent
component analysis (ICA), for improving ICA, and for estimating the reliability
of blind source separation.
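As an illustration of the nearest-neighbour idea, a minimal sketch of the first of the two estimator families (the KSG construction, $I = \psi(k) + \psi(N) - \langle \psi(n_x+1) + \psi(n_y+1) \rangle$), using max-norm distances; the function names and tolerance handling are assumptions of this sketch, not the authors' reference code:

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma

def ksg_mutual_information(x, y, k=3):
    """First KSG estimator of I(X;Y) in nats.

    x, y: arrays of shape (N, dx) and (N, dy).
    Finds the max-norm distance to the k-th neighbour in the joint
    space, then counts marginal neighbours strictly inside that radius.
    """
    n = len(x)
    xy = np.hstack([x, y])
    d, _ = cKDTree(xy).query(xy, k=k + 1, p=np.inf)
    eps = d[:, -1]                       # k-th neighbour distance per point
    nx = cKDTree(x).query_ball_point(x, eps - 1e-12, p=np.inf,
                                     return_length=True) - 1
    ny = cKDTree(y).query_ball_point(y, eps - 1e-12, p=np.inf,
                                     return_length=True) - 1
    return (digamma(k) + digamma(n)
            - np.mean(digamma(nx + 1) + digamma(ny + 1)))

# Usage: correlated Gaussians, where I = -0.5 * log(1 - rho^2).
rng = np.random.default_rng(2)
rho = 0.8
x = rng.standard_normal((5000, 1))
y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal((5000, 1))
print(ksg_mutual_information(x, y, k=3))   # roughly 0.51 nats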
Regularization of statistical inverse problems and the Bakushinskii veto
In the deterministic context, Bakushinskii's theorem excludes the existence of purely data-driven convergent regularization methods for ill-posed problems. In the present work we prove that, in the statistical setting, one can either construct a counterexample or derive an equivalent formulation, depending on the class of probability distributions considered. Hence, Bakushinskii's
theorem does not generalize to the statistical context, although this has often
been assumed in the past. To arrive at this conclusion, we derive from the classical theory new concepts for a general study of statistical inverse problems and give a systematic clarification of the key ideas of statistical regularization.
Iteratively regularized Newton-type methods for general data misfit functionals and applications to Poisson data
We study Newton-type methods for inverse problems described by nonlinear operator equations $F(u)=g$ in Banach spaces, where the Newton equations (the successive linearizations of $F$) are regularized variationally using a general
data misfit functional and a convex regularization term. This generalizes the
well-known iteratively regularized Gauss-Newton method (IRGNM). We prove
convergence and convergence rates as the noise level tends to 0 both for an a
priori stopping rule and for a Lepskiĭ-type a posteriori stopping rule.
Our analysis includes previous order optimal convergence rate results for the
IRGNM as special cases. The main focus of this paper is on inverse problems
with Poisson data where the natural data misfit functional is given by the
Kullback-Leibler divergence. Two examples of such problems are discussed in
detail: an inverse obstacle scattering problem with amplitude data of the
far-field pattern and a phase retrieval problem. The performance of the proposed method for these problems is illustrated in numerical examples.
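For orientation, a minimal sketch of the classical IRGNM iteration that the paper generalizes, with quadratic data misfit and a Tikhonov-type penalty; the toy operator, the geometric schedule for the regularization parameters, and the fixed iteration count (an a priori stopping rule) are assumptions of this sketch, not the paper's Poisson-data setting:

```python
import numpy as np

def irgnm(F, Fprime, g_obs, u0, alpha0=1.0, q=0.5, n_iter=12):
    """Iteratively regularized Gauss-Newton method, quadratic misfit.

    Each step minimizes, over u,
        ||F(u_n) + F'(u_n)(u - u_n) - g_obs||^2 + alpha_n ||u - u0||^2
    with alpha_n = alpha0 * q**n decreasing geometrically.
    """
    u = u0.copy()
    for n in range(n_iter):
        alpha = alpha0 * q**n
        J = Fprime(u)                          # Jacobian at the iterate
        r = g_obs - F(u)                       # current data residual
        # Normal equations of the regularized, linearized problem.
        A = J.T @ J + alpha * np.eye(len(u))
        b = J.T @ r + alpha * (u0 - u)
        u = u + np.linalg.solve(A, b)
    return u

# Usage on a toy smooth nonlinear operator from R^5 to R^20.
rng = np.random.default_rng(3)
A_mat = rng.standard_normal((20, 5)) / 5.0
F = lambda u: A_mat @ (u + 0.1 * u**3)
Fprime = lambda u: A_mat @ (np.eye(5) + 0.3 * np.diag(u**2))
u_true = np.array([1.0, -0.5, 0.3, 0.8, -1.2])
g_obs = F(u_true) + 0.01 * rng.standard_normal(20)
print(np.round(irgnm(F, Fprime, g_obs, u0=np.zeros(5)), 2))
```

In the paper's setting the quadratic misfit would be replaced by a Kullback-Leibler divergence for Poisson data, and the fixed iteration count by a discrepancy-based or Lepskiĭ-type stopping rule.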
- …